Chunking with WPDV Models
Similar resources
Weighted probability distribution voting, an introduction
This paper introduces a new machine learning technique, Weighted Probability Distribution Voting (WPDV). During learning, WPDV determines the output class probability distribution for each input feature, both atomic and complex. During classification, WPDV takes all input features that occur in the new input and adds the corresponding probability distributions, each multiplied by a weight facto...
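The voting scheme described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `train` and `classify`, the relative-frequency estimate of each distribution, the default weight of 1.0, and the final step of picking the highest-scoring class are all assumptions made for illustration, since the abstract is truncated before it specifies them. Complex features are assumed to be supplied by the caller as ordinary hashable identifiers alongside the atomic ones.

```python
from collections import Counter, defaultdict


def train(examples):
    """Estimate an output class distribution for every feature seen in training.

    `examples` is an iterable of (features, label) pairs, where `features`
    is a set of hashable feature identifiers (atomic or complex).
    Relative frequencies are an assumption; the source does not say how
    the distributions are estimated.
    """
    counts = defaultdict(Counter)
    for features, label in examples:
        for feature in features:
            counts[feature][label] += 1

    model = {}
    for feature, label_counts in counts.items():
        total = sum(label_counts.values())
        model[feature] = {lab: n / total for lab, n in label_counts.items()}
    return model


def classify(model, features, weights=None, default_weight=1.0):
    """Add up the weighted class distributions of all known input features.

    Returns (best_label, votes). Choosing the class with the largest
    summed score is an assumption of this sketch.
    """
    weights = weights or {}
    votes = defaultdict(float)
    for feature in features:
        distribution = model.get(feature)
        if distribution is None:
            continue  # feature never seen during learning: no vote
        weight = weights.get(feature, default_weight)
        for label, probability in distribution.items():
            votes[label] += weight * probability

    if not votes:
        return None, {}
    best_label = max(votes, key=votes.get)
    return best_label, dict(votes)


# Toy chunking-style data (hypothetical, for illustration only).
examples = [
    ({"w=the", "pos=DT"}, "B-NP"),
    ({"w=dog", "pos=NN"}, "I-NP"),
    ({"w=runs", "pos=VBZ"}, "B-VP"),
    ({"w=cat", "pos=NN"}, "I-NP"),
]
model = train(examples)
label, votes = classify(model, {"w=bird", "pos=NN"})
```

In this toy run, only `pos=NN` is known to the model, so it alone contributes a vote. Passing a `weights` dictionary lets individual features count for more or less, which is where a weighting scheme would plug in.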
Chinese Chunking Based on Maximum Entropy Markov Models
This paper presents a new Chinese chunking method based on maximum entropy Markov models. We first present two types of Chinese chunking specifications and data sets, to which the chunking models are applied. Then we describe the hidden Markov chunking model and the maximum entropy chunking model. Based on our analysis of the two models, we propose a maximum entropy Markov chunking model th...
A Default First Order Family Weight Determination
Weighted Probability Distribution Voting (WPDV) is a newly designed machine learning algorithm, for which research is currently aimed at the determination of good weighting schemes. This paper describes a simple yet effective weight determination procedure, which leads to models that can produce competitive results for a number of NLP classification tasks. 1 The WPDV algorithm Weighted Probabilit...
An Empirical Study of Vietnamese Noun Phrase Chunking with Discriminative Sequence Models
This paper presents an empirical study of the Vietnamese NP chunking task. We show how to build an annotated corpus for NP chunking and how discriminative sequence models are trained using the corpus. Experimental results using 5-fold cross-validation show that discriminative sequence learning is well suited to Vietnamese chunking. In addition, by empirical experiments we show that the part ...
An Empirical Study of Chinese Chunking
In this paper, we describe an empirical study of Chinese chunking on a corpus, which is extracted from UPENN Chinese Treebank-4 (CTB4). First, we compare the performance of the state-of-the-art machine learning models. Then we propose two approaches in order to improve the performance of Chinese chunking. 1) We propose an approach to resolve the special problems of Chinese chunking. This approa...
Publication date: 2000